TIPS: A Translingual Information Processing System
نویسندگان
چکیده
Searching online information is increasingly a daily activity for many people. The multilinguality of online content is also increasing (e.g. the proportion of English web users, which has been decreasing as a fraction the increasing population of web users, dipped below 50% in the summer of 2001). To improve the ability of an English speaker to search mutlilingual content, we built a system that supports cross-lingual search of an Arabic newswire collection and provides on demand translation of Arabic web pages into English. The cross-lingual search engine supports a fast search capability (sub-second response for typical queries) and achieves state-of-the-art performance in the high precision region of the result list. The on demand statistical machine translation uses the Direct Translation model along with a novel statistical Arabic Morphological Analyzer to yield state-of-the-art translation quality. The on demand SMT uses an efficient dynamic programming decoder that achieves reasonable speed for translating web documents.
منابع مشابه
Translingual Information Management by Natural Language Processing
Preface Translingual information management that can choose appropriate information and obtain useful knowledge from the ood of global information is being increasingly demanded. Among the related technologies, natural language processing is one of the most promising for meeting information needs because natural language processing can deal with documents that have essential roles in informatio...
متن کاملTowards Translingual Information Access Using Portable Information Extraction
We report on a small study undertaken to demonstrate the feasibility of combining portable information extraction with MT in order to support translingual information access. After describing the proposed system's usage scenario and system design, we describe our investigation of transferring information extraction techniques developed for English to Korean. We conclude with a brief discussion ...
متن کاملSite Method MIR TIR TIR / MIRCMU
We present an attempt at a coherent vision of an end-to-end translingual information retrieval system. We begin by presenting a sample of the broad range of possibilities, and the results of some initial work comparing the diierent approaches. We then present an overall workstation architecture, followed by two possible approaches to the actual translingual IR stage presented in detail. Ranking...
متن کاملTranslingual Information Retrieval: Learning from Bilingual Corpora
Translingual information retrieval (TLIR) consists of providing a query in one language and searching document collections in one or more diierent languages. This paper introduces new TLIR methods and reports on comparative TLIR experiments with these new methods and with previously reported ones in a realistic setting. Methods fall into two categories: query translation and statistical-IR appr...
متن کاملTranslingual Information Retrieval: Learning from Bilingual Corpora (ai Journal Special Issue: Best of Ijcai-97)
Translingual information retrieval (TLIR) consists of providing a query in one language and searching document collections in one or more diierent languages. This paper introduces new TLIR methods and reports on comparative TLIR experiments with these new methods and with previously reported ones in a realistic setting. Methods fall into two categories: query translation and statistical-IR appr...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2003